STATISTICAL UNITS AS STANDARDS.*

By Horace Secrist, Ph. D., Associate Professor of Economics and 
Statistics, Northwestern University. 



I. INTRODUCTION. 

The requirement that statistics shall be comparable and 
the employment of statistical methods scientific is no less im- 
portant, although it is undoubtedly more urgent, in times of 
war than of peace. We may hope for an end to war but we 
cannot expect the demands of statistical usage to be any less 
exacting. Never before has there been the same need as at 
present for an evaluation of accepted institutions, beliefs and 
methods, and for an appraisal of the rôle which statistical
science is to play in the solution of world problems. Approxi- 
mations, loose thinking, false judgments, crude comparisons, 
the mistaking of cause for effect, etc., because of ignorance, 
prejudice or a wilful desire to deceive, seem forever to be con- 
demned in the searching criticism of realities which has come 
to us with the war. 

Few fields of public or private activity seem to have escaped 
the demand for the creation of new standards. In the so- 
called scientific world, the slow processes of adjustment to 
new and changing conditions seem recently to have been greatly 
accelerated. In the business world where standards of meas- 
urements, uses, activities, etc., have not already been fixed 
and installed, either competition or state regulation is forcing 
their adoption. The scientific approach to economic and 
social problems seems to have caught public attention. Ac- 
counting has had its meteoric rise during the last decade and 
its cost aspects are rapidly coming into their own. The aim 
is clearly the introduction of scientific method, the appraisal 
of differences and similarities, the determination of cause and 
effect, all with the purpose of adjusting the processes of private 
and public business to the particular demands and needs of 
time and place. 

* Paper read at the seventy-ninth Annual Meeting of The American Statistical Association.

The development of statistical methods in the interpretation
of biological phenomena has been rapid. In the fields
of business and social science, however, the growth has been 
slower, and of a more uncertain and unscientific type. Only 
recently has the popular assumption been in part dispelled 
that by "statistics" one can "prove anything." Even now 
the position of statistical methods is not secure nor are the 
uses to which statistics are put above serious criticism outside 
the laboratories and research fields of statistical students, the 
statistical departments of some of the more advanced govern- 
ment bureaus, and the more progressive private businesses. 
Statistical surveys and even statistical departments, in private 
and public business, are common, but that statistics are more 
than records of past activities (collected not because of their
relationship to future policy but rather because they are "comparable"
with those already at hand) and that they may be
made to supplement accounting in the formulation of rules 
and principles for future guidance, unfortunately have not 
become generally felt. Their present position is similar to 
that occupied by accounting ten years ago. People are not 
completely nor universally converted to the wisdom of their 
use, nor are they fully cognizant of the extent of their applica- 
tion. 

The prejudice against both statistics and statistical methods, 
in part at least, is due to the following tendencies:

(1) To accept without serious question a plausible de- 
scription of a given condition or state of affairs. Ipse dixit 
is often regarded as sufficient proof. The mere fact of sta- 
tistics appearing in print, and particularly of their being in 
tabulated or graphic form — the finality of a statistical table 
or graph is often magical — is frequently sufficient to insure 
their value and to guarantee their application. 

(2) To employ statistical data without knowledge of, or 
regard for the units of measurements in which they are ex- 
pressed, or their comparability or representativeness, and to 
draw conclusions from them which they were never intended to 
support. 

(3) To disregard detail, or to regard it as "detail" which 
somehow will take care of itself and needs no especial atten- 
tion, to ignore statistical cautions respecting the collection of 
data or the use of those already collected, to speak in terms of
statistical abbreviations, averages of all types, to employ 
totals as if they were always more sacred and inviolate than 
the items which go to make them up, and to piece together 
statistical fragments, gleaned from different sources and com- 
piled under widely different circumstances, into a beautiful 
mosaic which thoroughly proves or disproves a contention 
already held. 

(4) To fail to formulate the purposes of statistical studies, 
to outline appropriate methods in order to serve the ends 
desired, to define with precision the units employed in the 
measurements, and rigidly to limit the field to be covered. 

Statistics do not answer questions nor support conclusions 
independently of those who manipulate them. Judgment, 
candor, and integrity in their use are necessary at every step. 
The scientific development of statistical methods depends not
only upon these but also upon a full realization of the mean- 
ing and function of units of measurements. It is statistical 
units as standards with which I shall deal in this brief paper. 

II. UNITS AND STATISTICAL METHODS. 

The statistical approach is numerical. Things, attributes, 
and conditions are counted, divided, subdivided, totalled and 
combined. Statistics are in large measure synthetic. They 
deal with aggregates, rather than single instances or rare oc- 
currences. These, however, relate to units of measurements 
characteristic of things or conditions studied and apply to 
definite uses. It is not 1,000 as an abstract unit of frequency 
which is considered, but 1,000 farms, industrial establishments, 
loans, mortgages, etc. Numbers as abstract units may be 
combined, separated and divided because they are homoge- 
neous, the more or less merely indicating presence or absence 
of a condition represented abstractly. But this is not true of 
units of measurements dealt with in statistics. The physical 
measurements of the unit "ton-mile," for instance, remain 
constant; but the qualities of the unit vary with each purpose 
for which it is used. A ton is invariably a ton and a mile a 
mile, but all tons, except as to weight, are not the same, nor
are all miles, except as to length, equivalent. The problem 
of enumeration is not so much that of counting units describing
different degrees of intensity, abundance or absence of the
same thing, as it is of counting different things which have been 
given the same general name. Things which are equal to each 
other in name are often not so in use. Standardization implies 
homogeneity; it suggests conformity and suitability to condi- 
tions determined in the light of particular application. The 
meaning of a statistical unit is a function of the use to which 
it is put. An illustration will give point to this contention. 

It is desired to determine the industrial accident rate in 
a given industry as a basis for fixing a scale of compensation. 
What is an accident? The reason for compensation is the 
consequences of personal injury and it is the character of the 
injury which serves as a basis for enumeration. All injuries 
involving a loss of any time howsoever slight might be thought 
worthy of inclusion. But since compensation is the occasion 
for the determination of the number, only those injuries should 
be included which cause an appreciable loss of time. What 
is an appreciable loss of time? To an individual who expe- 
rienced the loss, it might be any time, howsoever slight. To 
the employer, however, who advances the compensation, and 
to the public who finally bears it, a period of one or two weeks 
might be thought to be the minimum compensable period. 
But many trifling accidents may occasion a far greater loss of 
time than a single or a few serious ones. There would be no 
hesitancy about counting the serious, yet there might be re- 
specting the minor ones. But it is precisely the latter which 
frequently can most easily be prevented, and about which 
information may be desired, since precautionary measures 
involving little added cost to the employer, increased efficiency 
to the employee, and the gradual elimination of the occasion 
for compensation, may be taken for their eradication. 

Moreover, only industrial accidents are to be compensated. 
Self-inflicted injuries as well as those occurring to workmen 
while not engaged in industrial operations, and when work 
done is not a proximate cause of injury, should be eliminated, 
when accidents are enumerated for this purpose. Moreover, 
is disease contracted directly as a result of the conditions of 
industry an accident? Surely it is an "injury," and if injury 
is the basis of compensation, ought not disease of this type
to be counted in determining upon a reasonable basis? If 
disease contracted directly as a condition of employment is 
counted as an industrial injury (not "accidental," but charac- 
teristic or regular), how should instances involving impair- 
ment of health, mental or physical ability, be considered? 
How long a period must elapse before a condition, the result 
of employment, ceases to be checked against such employment? 
What is an industrial accident for compensation purposes? 

On the other hand, if the purpose of enumerating industrial 
accidents were to measure the amount of time lost through 
mental or physical injury, obviously, all accidents and all 
diseases directly attributable to industry should be included. 
If the purpose were solely to secure information as a basis for 
removing the conditions causing accidents, or for assigning 
responsibility for them as between employer and employee, 
machine and injured person, those which were trivial, from 
the point of view of the individual, would take equal rank with 
those denominated severe. What is an industrial accident? 
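
The dependence of the count upon the purpose may be sketched in a few lines of Python. The sketch is illustrative only: the records are invented, and the field names, the seven-day minimum, and the exclusion of disease from the compensation and prevention counts are assumptions made for the example, not rules laid down in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Injury:
    days_lost: int    # working days lost through the injury
    industrial: bool  # True when the work was a proximate cause
    disease: bool     # True for disease contracted through employment


# Hypothetical records, invented for illustration only.
RECORDS = [
    Injury(days_lost=0, industrial=True, disease=False),
    Injury(days_lost=2, industrial=True, disease=False),
    Injury(days_lost=30, industrial=True, disease=False),
    Injury(days_lost=10, industrial=False, disease=False),
    Injury(days_lost=20, industrial=True, disease=True),
    Injury(days_lost=15, industrial=True, disease=True),
]

MIN_COMPENSABLE_DAYS = 7  # assumed minimum compensable period (one week)

# One definition of the unit "industrial accident" per purpose.
RULES = {
    # Compensation: industrial, appreciable loss of time, disease excluded (assumption).
    "compensation": lambda r: r.industrial
    and not r.disease
    and r.days_lost >= MIN_COMPENSABLE_DAYS,
    # Time lost: every industrial injury or disease that cost any time.
    "time_lost": lambda r: r.industrial and r.days_lost > 0,
    # Prevention: every industrial accident, trivial or severe alike.
    "prevention": lambda r: r.industrial and not r.disease,
}


def count(records, rule):
    """Count the records that qualify as units under one definition."""
    return sum(1 for r in records if rule(r))


counts = {name: count(RECORDS, rule) for name, rule in RULES.items()}
print(counts)
```

The same six records yield three different totals, and no one of them is "the" number of accidents apart from the purpose for which it was counted.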

To formulate the purposes for which statistics are to be 
collected and used is the first step in statistical analysis; 
rigidly and unmistakably to define the units of measurements 
in which aggregates are expressed and to adhere to them 
throughout the process, is the second. The latter is governed 
by the former, as the former is determined by the latter. The 
two are reciprocal. Statistical units cannot be defined out- 
side of the purposes of their employment, nor the purposes 
fully realized without the use of definite and standardized 
units of measurements. 

The foregoing discussion will serve to make clear the dis- 
tinction between the use of mass or frequency concepts in 
pure mathematical calculations and the use of the same con- 
cepts when associated with statistical units. Statistics is 
more than arithmetic. Numerical considerations and preponderance
of evidence are the bases for statistical conclusions,
but to arrive at them more than numerical computations
are involved. Statistics is concerned with the processes and
methods of formulating and testing conclusions from premises 
resting upon numerical bases.

Leaving this more general discussion of the relationship of
statistical units to the purposes for which they are used, cer- 
tain types of units may be distinguished, and some of their 
peculiarities noted. 

III. TYPES OF STATISTICAL UNITS OF MEASUREMENTS. 

Distinction is drawn between units of enumeration and estimation
and units of analysis and synthesis. The first are those
by means of which statistics are collected; the second, those 
by means of which statistics are interpreted. The former are 
related more to statistics as numerical facts; the latter, more 
to statistics as methods in the use of these facts. 

1. Units of Enumeration and Estimation. 

Units of enumeration and estimation may conveniently be 
divided into two classes, simple and composite. By simple 
units are meant those in which one determining consideration 
is prescribed. Most statistics of enumeration employ simple 
units, as, for instance, where persons, animals, acres, etc., are 
counted or estimated. In units of this type the conflict be- 
tween identity and use is reduced to a minimum. They are 
simple and this fact normally guarantees against the pres- 
ence of as great a degree of error as is associated with units 
which are composite in character. The unit "farm," for 
instance, for a given purpose, might be easily defined and the 
statistics of "farms" readily understood. When, however, 
the limiting word "improved" is added, both the scope of the 
unit and its application are noticeably restricted. The addi- 
tional element is as subject to error as is the root portion of 
the combined unit. Crops in bushels or in acreage may be 
readily determined; to establish the "normality" of these 
crops, however, raises other problems and calls for superior 
statistical organization and for a much greater exercise of 
judgment. New conditions enter, occasions for error and 
bias crowd in, and it is these to which attention is drawn in 
distinguishing between simple and composite units. 

Moreover, the addition of a limiting word to a simple unit 
may change the meaning which the root carries when used 
alone. For instance, the unit "room," in a survey conducted 
solely to determine the size of rooms in tenement buildings
would be defined in such a way as to call for the listing of any 
portion of a house habitually used as a place of abode set off 
by walls with exits either closed or capable of being closed. 
To add to this unit the word "sleeping," suggests so many 
considerations respecting light, ventilation, size in respect 
to number of occupants, and time of occupancy, etc., as ma- 
terially to alter the meaning attached to the unit when the 
counting was undertaken to determine size, but not size in 
connection with use. 

The point which it is sought to emphasize is the fact that 
the identity of a statistical unit is a function of its use. For 
simple units, identity is established by general criteria; for 
composite units, by particular criteria. The more complex 
a unit becomes, the narrower is its application and the greater 
the necessity that its parts be standardized. Crude units 
may suffice for general impressions, but standardized measures 
are necessary for discriminating analysis. This is particularly 
true in cost accounting. Cost units must be reduced to their
simplest and most elementary form. If composite or com- 
pound units are used, comparisons are likely to be misleading 
and their significance indeterminate. This fact is no less true 
in the use of statistical than in cost data. 

2. Units of Analysis and Synthesis. 

Both simple and composite units become units of analysis 
and synthesis when comparison or the establishment of rela- 
tions follows from their use. Before classifying and discuss- 
ing these, brief attention should be given to comparison and to 
what it implies statistically. 

Comparison must be made between things possessing com- 
mon qualities. These may be of time, of place, or of condi- 
tion. For instance, the accident rate in a given industry may 
be compared before and after the installation of safety devices. 
Comparison may extend to two industries operating at differ- 
ent places or under different conditions, the purpose being 
merely to record a quantitative difference. But comparison 
is rarely made for this alone. Generally, a more or less definite 
purpose of establishing causal connection lies in the background.
A specific inquiry is to determine whether phenomena
stand in the relation of cause and effect, or whether they
are the result of a common cause. 

How nearly economic and business phenomena remain 
homogeneous for any appreciable period, even in an approxi- 
mate sense, is always problematical. The forces affecting 
them are always in a state of flux governed as they are by 
population composition, state of trade, distribution of wealth, 
custom, fad, fashion, prejudice, etc. The whole range of 
human reaction is exhibited in more or less degree. Statis- 
tics under such circumstances often reveal a partial story, are 
not comparable from time to time and from place to place, 
and taken alone constitute a weak and uncertain base upon 
which to build a cause-and-effect structure. 

Since comparison involves the pairing of things or events 
which are not identical in all particulars, a study of cause and 
effect, whether of coincidence or sequence, becomes largely 
a study of association. Causes never operate under exactly 
the same circumstances. Oneness of effect is only apparent, 
variation being evident the moment that the scale of measurement
is reduced. Simply to assume the proviso "other things
being equal" is not fully to atone for the sins committed in statistical
comparisons. The "other things" are rarely if ever
equal in actual life. Neither economic nor business phenom- 
ena go on indefinitely repeating themselves in one unending 
round of sameness. To expect that an absolute cause will 
always result in an absolute effect or that the "other things" 
will automatically take care of themselves is futile. 

If comparison of economic phenomena is difficult, and the 
assignment of cause and effect rarely if ever absolute, the sta- 
tistical units of measurements, by means of which comparisons 
are made, must be standardized according to use. Statistical 
comparisons involve the use of averages, of coefficients or ratios, 
in which enumerated or estimated numerators are related to 
enumerated or estimated denominators. To assign meaning 
to these without taking the trouble to determine the condi- 
tions which produce them or their suitability to the cases in 
point is as wrong statistically as it is logically to draw a false 
analogy. To do the first is to ignore the existence of determining
circumstances; to do the latter, to ignore their application.

The use of averages and coefficients as means of comparison 
suggests the formation of a judgment or a conclusion follow- 
ing from a full consideration of detail which they replace. 
Both represent the culmination of a process of thought and 
when removed from the steps required for their determination 
are likely to be assigned new meanings and used for purposes
foreign to those for which they were designed. Neither should
be regarded as a "secret something which determines events."
They are simply statistical abbreviations into which are crys- 
tallized relations arrived at by logical processes of thought. 
Chronologically, they come late in the process of analysis. 

Coefficients may be classified from two points of view; first, 
as units of interpretation and second, as units of presentation. 
Respecting the first, three subclasses, or more properly, three
aspects may be distinguished, viz., those of condition, of time, 
and of place. The characteristic features of each subclass 
and the reasons for differentiating the concept in this manner 
may best be shown by means of illustrations. 

(1) Units of Interpretation. By the use of clearly defined 
simple units of measurements, suppose the exact number of 
deaths from infantile paralysis, occurring in a given year, has
been determined for a given district. The population of the 
same district has also been correctly enumerated or otherwise 
determined. The problem is to express the deaths from this 
cause in the form of a coefficient — to relate them to popula- 
tion. Obviously, the total population is too broad a base, 
since the particular cause of death is common to only a re- 
stricted group of the total. Conditions affecting both numera- 
tor and denominator must be made homogeneous. Similarly, 
industrial accident rates are of little comparative worth unless 
both frequency and severity are related to a standardized oc- 
cupational exposure. If cost coefficients, in the business 
world, are to be significant, comparisons between stock turn- 
overs, for instance, must be made only when classified sales at 
cost or selling price are related to classified stock reduced to 
corresponding bases. Likewise, labor turnover becomes a 
significant coefficient only when a unit of labor displacement is 
related to a corresponding unit of labor force. Comparisons
may be general only when the conditions upon which they rest 
have become standardized. 
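
The requirement of corresponding bases in a stock-turnover coefficient may be shown by a short Python sketch. All figures are hypothetical, and the function name `turnover` is introduced here for illustration only; the point is simply that sales at selling price set against stock at cost give a ratio which answers to neither basis.

```python
def turnover(sales: float, average_stock: float) -> float:
    """Ratio of sales to average stock; both must be valued on the same basis."""
    if average_stock <= 0:
        raise ValueError("average stock must be positive")
    return sales / average_stock


# Hypothetical figures for one department, invented for illustration.
sales_at_selling_price = 150_000.0
sales_at_cost = 100_000.0
stock_at_cost = 25_000.0
stock_at_selling_price = 37_500.0

# Numerator and denominator on corresponding bases agree with each other.
turnover_at_cost = turnover(sales_at_cost, stock_at_cost)
turnover_at_selling = turnover(sales_at_selling_price, stock_at_selling_price)

# Mixed bases: selling-price sales against cost-valued stock overstate the turnover.
turnover_mixed = turnover(sales_at_selling_price, stock_at_cost)

print(turnover_at_cost, turnover_at_selling, turnover_mixed)
```

With these assumed figures the two consistent computations agree, while the mixed one is half as large again, although nothing in the business has changed.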

The distinction which is being emphasized is between crude 
and corrected coefficients. Crude rates are never to be pre- 
ferred when corrected ones are available. Correction consists 
in more accurately defining, measuring and enumerating units 
and in referring phenomena rigidly to the conditions producing 
them. Where this is not done, the amount of error involved 
in comparisons is almost never known, and provision for it 
seldom possible. 
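
The difference between a crude and a corrected coefficient may be illustrated with a minimal Python sketch. The two districts, their populations, and the choice of children under ten as the exposed group are assumptions made for the example; the text says only that the cause of death is common to a restricted group of the total.

```python
def rate_per_100k(events: int, base: int) -> float:
    """Events per 100,000 of the population taken as the base."""
    if base <= 0:
        raise ValueError("base population must be positive")
    return events * 100_000 / base


# Hypothetical districts: (deaths, total population, children under ten).
DISTRICTS = {
    "A": (30, 500_000, 100_000),
    "B": (30, 500_000, 50_000),
}

# Crude rate: deaths related to the whole population.
crude = {name: rate_per_100k(d, total) for name, (d, total, _) in DISTRICTS.items()}

# Corrected rate: deaths related only to the group assumed to be exposed.
corrected = {name: rate_per_100k(d, kids) for name, (d, _, kids) in DISTRICTS.items()}

print(crude)
print(corrected)
```

On the crude basis the two districts appear identical; referred to the assumed exposed group, district B shows twice the rate of district A.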

Time and place are also factors of importance in the use of 
coefficients. A comparison of the death rates from malaria 
for the South and North is of little real value. There is little, 
if any, significance in a comparison of the number of miles of 
steam railroads per capita or per one hundred square miles of 
territory for New Jersey and Nevada. Why? The answer is 
clear; because the conditions are so widely different; the same 
phenomena are related to conditions wholly dissimilar or in 
each case of local application. Similarly, comparisons of the 
ratios of the number of bank failures to bank liabilities for the 
period before state and national regulations were inaugurated 
with the present time; of per capita city expenditures or debt 
of the 70's or 80's with 1917, are to a large degree without 
meaning. In the first case, regulation has so changed the 
conditions under which banking is done that there is little in 
common between the earlier and later periods; in the second 
case, the respective domains of public and private initiative 
differ so radically that a consideration of the amount of ex- 
penditure divorced from the benefits accruing from it is with- 
out merit. 

Too great care can not be taken to make comparisons legit- 
imate. This is particularly true in the case of statistical com- 
parisons, since they are numerical and seemingly exact. A 
statistical statement is often taken by the unwary and unini- 
tiated, as sufficient proof of its absoluteness and finality, and 
is made to support predetermined conclusions or premises to 
which it has no relation. Too much faith is placed in the effi- 
cacy of statistics to "prove things." Reasoning from other 
angles is too frequently dispensed with, if not utterly ignored,
on the part of the uninformed when "statistics" can be
utilized, notwithstanding the fact that they may have no ap- 
plication, may be incomplete, unrepresentative, and question- 
able in origin, and that the problem can not be understood by 
an appeal to its numerical side. Loose reasoning and hasty 
judgments are even less defensible when statistics are appealed 
to to support a contention than when they are ignored, for the 
reason that they seem to carry a finality and to suggest a nicety 
of conclusion not generally associated with a less precise 
method of approach. 

(2) Units of Presentation. Coefficients may also be regarded 
from the point of view of units of presentation. This thought 
suggests classification or the art of arranging data into groups 
according to their common characteristics. "Performed 
consciously or unconsciously, the act of classification is indis- 
pensable to and accompanies every scientific inference. A 
mind is orderly or slovenly, according as it does or does not 
habitually and accurately classify the facts with which it comes 
in contact. The success of an investigation, the worth of a 
conclusion, are in direct proportion to the fidelity to this principle
and the exhaustiveness with which the process is carried
out."*

Loose thinking, mistaken emphasis, and the assignment of 
cause for effect, or vice versa, result from a denial or a viola- 
tion of this principle. This truth is involved in all that is 
suggested in the term "standardization," and applies no less 
to statistical science than it does to business and economic 
procedure. It is the principle of orderly arrangement, and to 
violate it is as indefensible when dealing with statistical facts 
as when formulating, for instance, systems of cost accounts. 
A cost system which failed to distinguish between overhead 
and material costs could no more be defended than a statistical 
summary which grouped together facts of different properties. 
Combinations must be made on bases that are common, and 
classification must follow lines that are significant. 

* Cramer, Frank, The Method of Darwin: A Study in Scientific Method, p. 88.

It is indispensable, if statistics are to function, to adopt
those units of presentation which give facts vitality. Statistics
collected, classified and tabulated without a well-defined
purpose are seldom of much value because of the lack
of care in their preparation and because of the absence of a 
controlling purpose in their presentation. Too frequently the 
unit groups are so broad, purposeless and indefinite that what- 
ever value the facts may have had as collected, is lost by the 
failure to correlate the method of presentation with the pur- 
pose or function which they are to play. Thus we have death 
rates tabulated by districts so large that correlation of deaths 
with their respective causes in detail is difficult if not impossible.
From an administrative point of view, such statistics
are frequently worthless. Similarly, density of population 
— a common coefficient — becomes meaningless when assigned 
to so large a population and so diverse conditions as those 
comprehended in an entire city. Density as a coefficient is 
significant only where over-crowding is a problem. Again, it 
is of limited significance to know that the great majority of 
wage earners in the United States receive less than, say, $1,200 
a year. What is necessary to know is the distribution and 
wages of those below this limit. The wages of a non-homo- 
geneous class expressed as a total or as an average are of little 
significance in throwing light on such problems as the distri- 
bution of wealth, the basis for arbitration of wage disputes, 
standards for minimum wages, etc. Units for expression are 
generally too broad; the facts are related to conditions which 
are not homogeneous. Statistics in this form become more 
an end than a means to an end, more a goal than a process. 
Too great an expense and insufficient time are the stock 
excuses given for failure to classify and present statistical 
facts in detail. The validity of these common excuses for
inefficiency and statistical sinning is not always easy of deter- 
mination, but it is clear that it is not money and time which 
constitute our gravest statistical needs, but coöperation, planning,
correlation of activities, and above all an appreciation
of the fact that statistics may serve not only as records of 
past achievement but far more significantly as guides for future
activities. They find their chief justification in the manner
in which they minister to our positive needs.

IV. CONCLUSION.

Our general conclusions respecting statistical units as stand- 
ards both in definition and application may be summarized as 
follows: Units should be clearly and fully defined in the light 
of the uses which they are to serve, in keeping with the intelli- 
gence of those who are to use them, and in such form that 
overlapping conditions will be readily detected, misunderstand- 
ing difficult and employment specific. They should be rigidly 
referred to the conditions which produce them; should be 
homogeneous with respect to the purposes for which they are 
used, and employed with consistency and integrity. After 
all, in the development and use of statistical units as standards, 
as in all statistical processes, an appreciation of the meaning 
of scientific method and a willingness to be guided by its re- 
quirements are indispensable. If either is lacking, statistics 
and statistical methods are without a logical defense. 



